20 research outputs found

    Editorial of the special issue on latest advancements in linguistic linked data

    Since the inception of the Open Linguistics Working Group in 2010, there have been numerous efforts to transform language resources into Linked Data. The research field of Linguistic Linked Data (LLD) has gained in importance, visibility and impact, with the Linguistic Linked Open Data (LLOD) cloud now gathering over 200 resources. With this growth, new challenges have emerged concerning particular domain and task applications, quality dimensions, and the linguistic features to take into account. This special issue aims to review and summarize the progress and status of LLD research in recent years, as well as to offer an understanding of the challenges ahead of the field for the years to come. The papers in this issue indicate that there are still aspects to address for a wider community adoption of LLD, as well as a lack of resources for specific tasks and (interdisciplinary) domains. Likewise, the integration of LLD resources into Natural Language Processing (NLP) architectures and the search for long-term infrastructure solutions to host LLD resources remain essential points to attend to in the foreseeable future of this line of research.

    Towards Better Text Understanding and Retrieval through Kernel Entity Salience Modeling

    This paper presents a Kernel Entity Salience Model (KESM) that improves text understanding and retrieval by better estimating entity salience (importance) in documents. KESM represents entities by knowledge-enriched distributed representations, models the interactions between entities and words by kernels, and combines the kernel scores to estimate entity salience. The whole model is learned end-to-end using entity salience labels. The salience model also improves ad hoc search accuracy, providing effective ranking features by modeling the salience of query entities in candidate documents. Our experiments on two entity salience corpora and two TREC ad hoc search datasets demonstrate the effectiveness of KESM over frequency-based and feature-based methods. We also provide examples showing how KESM conveys its text understanding ability learned from entity salience to search.
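
    The abstract does not say which kernels KESM uses or how the scores are combined. As a rough illustration only, the sketch below assumes Gaussian (RBF) kernel pooling over cosine similarities between one entity vector and the word vectors of a document, followed by a linear combination of the pooled kernel scores. The function names, the kernel centres `mus`, the width `sigma`, and the weights are all hypothetical choices made for this example and are not taken from the paper; in a trained model the weights and embeddings would be learned from salience labels.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def kernel_features(entity_vec, word_vecs, mus, sigma=0.1):
    """Pool entity-word similarities with one RBF kernel per centre in `mus`.

    Each kernel softly counts how many words have a similarity close to its
    centre. `mus` and `sigma` are assumed hyperparameters for this sketch.
    """
    sims = [cosine(entity_vec, w) for w in word_vecs]
    return [
        sum(math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in sims)
        for mu in mus
    ]


def salience_score(entity_vec, word_vecs, weights, mus, sigma=0.1):
    """Combine the pooled kernel scores linearly into a single salience score."""
    feats = kernel_features(entity_vec, word_vecs, mus, sigma)
    return sum(w * f for w, f in zip(weights, feats))


if __name__ == "__main__":
    # Toy 2-d embeddings: one word identical to the entity, one orthogonal to it.
    entity = [1.0, 0.0]
    words = [[1.0, 0.0], [0.0, 1.0]]
    print(salience_score(entity, words, weights=[2.0, 0.5], mus=[1.0, 0.0]))
```

    Pooling with several kernel centres lets such a model distinguish exact matches from loosely related words instead of relying on raw entity frequency, which is the contrast the abstract draws with frequency-based methods.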

    Language resources and linked data: a practical perspective

    Recently, experts and practitioners in language resources have started recognizing the benefits of the linked data (LD) paradigm for the representation and exploitation of linguistic data on the Web. The adoption of the LD principles is leading to an emerging ecosystem of multilingual open resources that make up the Linguistic Linked Open Data Cloud, in which datasets of linguistic data are interconnected and represented following common vocabularies, which facilitates linguistic information discovery, integration and access. In order to contribute to this initiative, this paper summarizes several key aspects of the representation of linguistic information as linked data from a practical perspective. The main goal of this document is to provide the basic ideas and tools for migrating language resources (lexicons, corpora, etc.) to LD on the Web and for developing some useful NLP tasks with them (e.g., word sense disambiguation). This material was the basis of a tutorial given at the EKAW’14 conference, which is also reported in the paper.

    A survey of guidelines and best practices for the generation, interlinking, publication, and validation of linguistic linked data

    This article discusses a survey, carried out within the NexusLinguarum COST Action, which aimed to give an overview of existing guidelines (GLs) and best practices (BPs) in linguistic linked data. In particular, it focused on four core tasks in the production and publication of linked data: generation, interlinking, publication, and validation. We discuss the importance of GLs and BPs for LLD before describing the survey and its results in full. Finally, we offer a number of directions for future work in order to address the findings of the survey.

    Cross-Lingual Link Discovery for Under-Resourced Languages

    CC BY-NC 4.0
    In this paper, we provide an overview of current technologies for cross-lingual link discovery, and we discuss challenges, experiences and prospects of their application to under-resourced languages. We first introduce the goals of cross-lingual linking and associated technologies, and in particular, the role that the Linked Data paradigm (Bizer et al., 2011) applied to language data can play in this context. We define under-resourced languages with a specific focus on languages actively used on the internet, i.e., languages with a digitally versatile speaker community, but limited support in terms of language technology. We argue that, for languages for which considerable amounts of textual data and (at least) a bilingual word list are available, techniques for cross-lingual linking can be readily applied, and that these enable the implementation of downstream applications for under-resourced languages via the localisation and adaptation of existing technologies and resources.

    DEVELOPING MASHUP APPLICATIONS USING EMML

    In this thesis we present in detail enterprise mashups and the Enterprise Mashup Markup Language (EMML). We examine the architecture of enterprise mashups to make it easier to identify the challenges they bring, and we stress the need for introducing enterprise mashups in enterprises. This is followed by a detailed description of the core of EMML as a standard for enterprise mashup development. We present the advantages that EMML brings and identify possible obstacles. The development of enterprise mashups using EMML is demonstrated on a practical case: building mashups for the supervision of international student exchanges.